COPYRIGHT NOTICE

A portion of the disclosure recited in the specification contains material
which is subject to copyright protection. Specifically, source code instructions are
included for a process by which the present invention is practiced in a computer system.
The copyright owner has no objection to the facsimile reproduction of the specification as
filed in the Patent and Trademark Office. Otherwise all copyright rights are reserved.



BACKGROUND OF THE INVENTION 
This invention relates in general to computer software and more
specifically to a system for preventing accurate disassembly of computer programs.

Computer software manufacturers have a keen interest in protecting their
software. Software can be easily copied, in whole or in part, by making digital copies.
Other forms of copying do not require a competitor to copy the actual digital data, but
are based on a knowledgeable programmer viewing the instructions within the software to
gain information that can allow the programmer to "break" security systems, obtain
valuable programming techniques or trade secrets of the software manufacturer, make
derivations, manipulate the operation of the original code, etc.

One barrier to copying computer software is that many forms of software 
are distributed in a format that is not easily decipherable, or readable, by a human. 

Figure 1B is an illustration of various forms in the prior art into which a
computer program, or software, is transformed during the process of creation,
distribution, and ultimate execution of the software on a user's machine.

In Fig. 1B, human-readable source code 10 is developed by a programmer
who is the original author, and owner, of the work. Such source code is easily readable
and understandable by a human programmer since the source code is written in text that
resembles plain English with mathematical and logical equations. Many different forms
of source code exist today based on many different types of computer languages.
"Assembly code" is a form of human-readable code that is closely tied to a specific
microprocessor's instruction set. Assembly code has many similarities to source code in
terms of the form translations that the assembly code undergoes prior to being executed.
For purposes of this specification, source code and assembly code can be treated
similarly, and terminology and concepts associated with source code and assembly code
can be interchanged. For example, as discussed below, compilation and assembly are
analogous, as are decompilation and disassembly.

Returning to Fig. 1B, source code 10 is compiled by compiler 12.
Compiler 12 is a software process that translates human-readable source code to a series
of numbers which is, for the most part, unreadable by humans. Source code 10 is thus
transformed, or "compiled," by compiler 12 to form the human-unreadable object code.
Object code 14 can be linked by linker 20 with other object code modules as illustrated by
object code modules 16 and 18 in Fig. 1B. Once the object code modules are linked by
linker 20, they form executable program 22. Executable program 22 can be loaded by
loader 24 into a user's computer to form executing image 26. Executing image 26
represents the actual numerical information that is executed by a microprocessor within
an end-user's computer.

Note that all forms of source code 10 that exist after compilation by
compiler 12 are, for the most part, unreadable by a human. In other words, object code 
modules 14, 16 and 18; executable program 22; and executing image 26 are basically 
unformatted conglomerations of numbers that are extremely difficult to understand. 

However, tools exist to decompile, or disassemble, these unreadable
versions of source code. Decompiler 28 can accept the unformatted numbers of object
code 14, executable program 22 or executing image 26 and produce a readable version of
the original source code program. Such a readable version is referred to as decompiled
(or disassembled) code 30. While the decompiled code is usually not as readable as
original source code 10, it is a very effective tool for allowing an experienced
programmer to understand the operation of the computer program and greatly reduces the
amount of time required to copy, hack, or otherwise manipulate source code produced by
an original programmer.

Thus, it is desirable to produce an invention which prevents, or reduces
the effectiveness of, decompilation, or disassembly, of compiled or assembled code.

SUMMARY OF THE INVENTION 
The present invention prevents disassembly of computer code. Such
prevention includes hiding, masking, or otherwise "obfuscating," the original code. This
helps thwart unwanted parties from making copies of an original author's software,
obtaining valuable information from the software for purposes of breaking into the
program, stealing secrets, making derivative works, etc. The present invention uses
special assembly-language instructions to confuse the disassembler so that it produces
results that are not an accurate representation of the original assembly code. In one
embodiment, a method is provided where an interrupt (typically a software interrupt) is
used to mask some of the subsequent instructions. The instruction used can be any
instruction that causes the disassembler to assume that one or more words subsequent to
the instruction are associated with the instruction. The method, instead, jumps directly
to the bytes assumed associated with the instruction and executes those bytes to achieve
the original functionality of the program.

A preferred embodiment works with a popular Microsoft "ASM"
assembler language and "DASM" disassembler. The instructions used to achieve the
obfuscation include software interrupt, "INT," instructions. Using this approach, up to 17
bytes of obfuscation can be achieved with five instructions. Each instruction remains
obfuscated until executed and returns to an obfuscated state afterwards.

In one embodiment, the invention provides a method for obfuscating
computer program instructions upon disassembly, the method comprising inserting an
obfuscating instruction for causing a disassembler to not disassemble one or more bytes
subsequent to the obfuscating instruction; and inserting a branch instruction to invoke
execution of the one or more bytes subsequent to the obfuscating instruction.



BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1A illustrates software instructions of the present invention; and
Figure 1B is an illustration of various forms in the prior art into which a
computer program, or software, is transformed during the process of creation,
distribution, and ultimate execution of the software on a user's machine.



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 
Fig. 1A illustrates software instructions of the present invention.

In Fig. 1A, instructions at 100 illustrate the concept of code obfuscation.
Such instructions are included within the body of an assembly language program. A
larger portion of the program is illustrated by preceding assembly code 102 and
succeeding assembly code 104. Note that the obfuscating instruction, and associated
instructions, can be inserted more than once within the program.

The obfuscating instruction, and associated instructions, include
obfuscating instruction 110, jump instruction 112 and hidden code 114. During execution
of the assembly code, the assembly program operates as intended by the original
programmer until jump instruction 112 is executed. When jump instruction 112 is
executed, obfuscating instruction 110 is skipped and execution proceeds at hidden
instructions 114. In other words, obfuscating instruction 110 is never executed. Hidden
instructions 114 are part of the instructions written by the original programmer and, thus,
are part of the original program. Only jump instruction 112 and obfuscating instruction
110 need to be inserted into the original program.

It should be apparent that the program will operate as originally intended
with the exception that a few more cycles of processor time are required in order to
perform the jump instruction 112. Also, a few more bytes of information are stored in the
program every time the technique of the present invention is used to account for jump
instruction 112 and obfuscating instruction (or instructions, as described below). The
number of hidden instructions at 114 varies with the specific obfuscating instruction, or
instructions, employed, as is discussed in detail, below.
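The insertion step and its byte cost can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `insert_obfuscators` and `byte_overhead` and the parameter `every_n` are hypothetical, the program is modeled as a plain list of instruction strings, and the inserted pair is the `JMP $+4` / `INT 35h` pair recited in the claims. The overhead figure assumes a two-byte short jump and a two-byte INT encoding.

```python
def insert_obfuscators(instructions, every_n, pair=("jmp $+4", "INT 35h")):
    """Insert a jump/obfuscating-instruction pair before every `every_n`-th
    original instruction (hypothetical helper, not the patent's own tool).

    The original instructions are kept unchanged and in order; they become
    the "hidden" code that follows each inserted pair.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    out = []
    for index, instruction in enumerate(instructions):
        if index % every_n == 0:
            out.extend(pair)  # only these two lines are new
        out.append(instruction)
    return out


def byte_overhead(insertions, jump_len=2, obfuscator_len=2):
    """Extra bytes stored, assuming a 2-byte short jmp and a 2-byte INT."""
    return insertions * (jump_len + obfuscator_len)


original = ["nop", "nop", "nop", "xor eax,eax", "ret"]
patched = insert_obfuscators(original, every_n=4)
insertions = (len(patched) - len(original)) // 2

for line in patched:
    print(line)
print("insertions:", insertions, "extra bytes:", byte_overhead(insertions))
```

With `every_n=4` the five-instruction input receives two pairs, so under the stated encoding assumptions the program grows by eight bytes.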

Note that jump instruction 112 need not be immediately adjacent to
obfuscating instruction 110. Any instruction that directs a processor to obtain the next
instruction from within the "hidden" instructions 114 can be sufficient. Also, although
the invention is discussed with respect to hidden instructions 114 being immediately
adjacent to obfuscating instruction 110, it is possible that obfuscating instructions may act
to hide non-adjacent instructions.

The present invention is described with respect to assembly language code
in "ASM" format. Such format is produced, for example, by the Microsoft VC++
compiler. It should be apparent that the techniques of the present invention can be
adapted for any type of assembler, or source code, or other computer languages and
syntax which provide a suitable obfuscation instruction.

By obfuscating code in different places throughout the program, it is much
more difficult for a programmer to obtain useful information. The decompiler loses
synchronization with the instructions and can display missing, or incorrect, instructions in
place of the actual ones. With enough portions of the code obscured, a would-be hacker
is required to trace through all the code, manually. The debugger (or disassembler) is
expecting the code to return after a jump to a certain instruction, but the code changes the
return location causing the debugger to break out of its gui. Two code examples are
provided in Table I and Table II:

call $+6 ;Highly efficient!
DB 0EBh
add dword ptr [esp],6
ret

TABLE I

call $+12 ;Not efficient.
DB 083h
jmp $+10
DB 08Bh
inc [esp]
ret

TABLE II
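The address arithmetic behind the first listing (the `call $+6` sequence) can be checked with a short Python sketch. It rests on assumptions that the text does not state: `$` is taken to be the address of the `call` instruction itself, and the instruction lengths are those of the usual 32-bit encodings (a five-byte near call, `83 04 24 06` for the add, and a one-byte `ret`). The function name `tricky_return` and the sample address are illustrative only.

```python
# Assumed instruction lengths in bytes (32-bit x86 encodings).
CALL_REL32 = 5    # E8 followed by a 4-byte displacement
DB_BYTE = 1       # the stray 0EBh byte emitted by DB
ADD_ESP_IMM8 = 4  # 83 04 24 06: add dword ptr [esp],6
RET = 1           # C3


def tricky_return(call_address):
    """Return (pushed return address, patched return address, fall-through).

    Assumes `$` in `call $+6` is the address of the call instruction.
    """
    pushed = call_address + CALL_REL32      # call pushes the address of the DB byte
    call_target = call_address + 6          # lands just past the DB byte
    assert call_target == pushed + DB_BYTE  # so the DB byte is never executed
    patched = pushed + 6                    # effect of add dword ptr [esp],6
    fall_through = call_target + ADD_ESP_IMM8 + RET
    return pushed, patched, fall_through


pushed, patched, fall_through = tricky_return(0x1000)
print(hex(pushed), hex(patched), hex(fall_through))
```

Under these assumptions the patched return address equals the address of the instruction following `ret`, so execution simply continues, while a tool that expects the call to return to the pushed address is looking at the lone `0EBh` byte instead.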



A more advanced technique can involve randomly exchanging jump
commands in the .ASM file with 'tricky returns.' This requires pushing the destination
address instead of altering the esp register like previous examples. This way, this
(intelligent) obfuscation macro would not be competing against other macros. By placing
the 'tricky returns' where there is already a jump, the byte overhead is reduced.

The instruction "INT 35h" has obfuscation properties. Unlike INT 20h, no
additional data is displayed. In fact, INTs 34h-3Ah or so have the same ability to totally
mask three bytes. As an example:



actual code          the debugger window

0 JMP 4              0 JMP 4
2 INT 35h            2 INT 35h
4 NOP                7 XOR EAX,EAX
5 NOP
6 NOP
7 XOR EAX,EAX
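The two columns above can be reproduced with a toy model in Python. This is a sketch, not a real disassembler: it knows only the four opcodes in the example, and the rule that an INT in the range 34h-3Ah swallows the next three bytes is hard-coded to mirror the behavior the text reports for the debugger. The names `decode`, `disassemble`, and `trace` are hypothetical.

```python
# Bytes for: JMP 4 / INT 35h / NOP / NOP / NOP / XOR EAX,EAX
CODE = bytes([0xEB, 0x02, 0xCD, 0x35, 0x90, 0x90, 0x90, 0x33, 0xC0])

MASKING_INTS = range(0x34, 0x3B)  # INT 34h..3Ah, per the text ("or so")
MASKED_BYTES = 3                  # bytes the modeled tool attaches to the INT


def decode(code, pc):
    """Decode one instruction; returns (text, length, jump target or None)."""
    op = code[pc]
    if op == 0xEB:  # short jump with signed 8-bit displacement
        disp = code[pc + 1]
        if disp >= 0x80:
            disp -= 0x100
        target = pc + 2 + disp
        return f"JMP {target}", 2, target
    if op == 0xCD:  # INT imm8
        return f"INT {code[pc + 1]:02X}h", 2, None
    if op == 0x90:
        return "NOP", 1, None
    if op == 0x33 and code[pc + 1] == 0xC0:
        return "XOR EAX,EAX", 2, None
    raise ValueError(f"opcode {op:02X}h is outside this toy model")


def disassemble(code):
    """Linear sweep that treats the bytes after a masking INT as its data."""
    listing, pc = [], 0
    while pc < len(code):
        text, length, _ = decode(code, pc)
        listing.append((pc, text))
        if code[pc] == 0xCD and code[pc + 1] in MASKING_INTS:
            length += MASKED_BYTES
        pc += length
    return listing


def trace(code):
    """Follow actual control flow from offset 0, taking every jump."""
    executed, pc = [], 0
    while pc < len(code):
        text, length, target = decode(code, pc)
        executed.append((pc, text))
        pc = target if target is not None else pc + length
    return executed


shown = disassemble(CODE)
executed = trace(CODE)
hidden = sorted({pc for pc, _ in executed} - {pc for pc, _ in shown})
print(shown)
print(executed)
print("hidden offsets:", hidden)
```

The sweep lists offsets 0, 2, and 7, while the trace executes offsets 0, 4, 5, 6, and 7, so the three NOP bytes never appear in the modeled debugger view.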



Of course, as much as this is helpful, three bytes of obfuscation is not all
that impressive. In tandem with INT 20h though, it is an entirely other story. This
example:



jmp $+2
INT 35h 
jmp $+2 
INT 20h 



yielded 14 bytes of obfuscation. Much better! But, then there is this fine example: 



jmp $+4
INT 35h
INT 20h



only six bytes long, but yielded an incredible 17 bytes of obfuscation over five
instructions. Each instruction remains obfuscated until executed and returns to an
obfuscated state afterwards.

Below is some gibberish code that does a fake comparison, then jumps
into the second byte of the compare, which, along with the first byte of the add
instruction, causes the program to jump to the byte after the DB. The purpose of this
snippet is to confuse the cracker, and in the process obfuscate six bytes. Although
unlikely, to avoid collision problems, the jne instruction should be switched to jmp.



3B EB    cmp ebp,ebx
04 00    add al,0h
75 FB    jne $-5
83       DB 083h
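The path through this snippet can be verified from the listed bytes alone. The Python sketch below computes the two short-branch targets, taking offset 0 as the first byte of the compare and assuming the branch is taken (the registers differ). The helper name `rel8_target` is illustrative.

```python
# The seven bytes of the listing: cmp / add / jne / DB
SNIPPET = bytes([0x3B, 0xEB, 0x04, 0x00, 0x75, 0xFB, 0x83])


def rel8_target(code, pc):
    """Target of a two-byte short branch (opcode + signed 8-bit displacement)."""
    disp = code[pc + 1]
    if disp >= 0x80:
        disp -= 0x100
    return pc + 2 + disp


# 75 FB at offset 4: jne with displacement -5, measured from offset 6.
jne_target = rel8_target(SNIPPET, 4)

# The landing byte is EBh, the second byte of the compare, which the
# processor now reads as the opcode of a short jmp whose displacement
# is the first byte of the add (04h).
assert SNIPPET[jne_target] == 0xEB
jmp_target = rel8_target(SNIPPET, jne_target)

print("jne lands at offset", jne_target)
print("overlapping jmp lands at offset", jmp_target)
```

The second target is offset 7, one past the `DB 083h` byte at offset 6, which matches the description above. If the compared registers happen to be equal the `jne` falls through into the `083h` byte, which is the collision case the text mentions.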



The object of these is only to obfuscate code. They are classified as
'petty obfuscators' because it would be more suitable to reuse a 'great obfuscator.'
To obfuscate four bytes:



jmp $+4 ; Note: this may need byteswapping
DD 0660FBCA3h ;BSF SP,[REG+4bytes]

To obfuscate five bytes:

jmp $+4
DD 0660FBAA3h ;BT WORD PTR [REG+4bytes],1byte

To obfuscate six bytes:

jmp $+4
DD 0660FBAA4h ;BT WORD PTR [REG*4+REG+4bytes],1byte
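The four, five, and six byte counts follow from the last byte of each constant, which acts as a ModRM byte and tells a disassembler how many operand bytes to attach to the instruction. The Python sketch below applies the standard 32-bit addressing rules to that byte. The function name `modrm_operand_length` is hypothetical, and one corner case (mod = 00 with a SIB byte that has no base register) is left out because it depends on the SIB byte itself.

```python
def modrm_operand_length(modrm, has_imm8):
    """Bytes a disassembler consumes after the ModRM byte (32-bit addressing).

    Omits the mod=00 / SIB-without-base case, which is not needed here.
    """
    mod = modrm >> 6
    rm = modrm & 0b111
    length = 0
    if mod != 0b11 and rm == 0b100:
        length += 1  # SIB byte follows
    if mod == 0b01:
        length += 1  # 8-bit displacement
    elif mod == 0b10:
        length += 4  # 32-bit displacement
    elif mod == 0b00 and rm == 0b101:
        length += 4  # 32-bit displacement with no base register
    if has_imm8:
        length += 1  # immediate operand (the BT forms take one)
    return length


# 66 0F BC A3: BSF, ModRM A3h, no immediate
print(modrm_operand_length(0xA3, has_imm8=False))
# 66 0F BA A3: BT r/m16,imm8, ModRM A3h
print(modrm_operand_length(0xA3, has_imm8=True))
# 66 0F BA A4: BT r/m16,imm8, ModRM A4h (adds a SIB byte)
print(modrm_operand_length(0xA4, has_imm8=True))
```

The three calls give 4, 5, and 6, matching the byte counts stated for the three constants.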

Although the present invention has been discussed with respect to specific 
embodiments, these embodiments are merely illustrative, and not restrictive, of the 
invention. The scope of the invention is to be determined solely by the appended claims. 



WHAT IS CLAIMED IS:

1. A method for obfuscating computer program instructions upon
disassembly, the method comprising
    inserting an obfuscating instruction for causing a disassembler to
not disassemble one or more bytes subsequent to the obfuscating instruction; and
    inserting a branch instruction to invoke execution of the one or
more bytes subsequent to the obfuscating instruction.

2. The method of claim 1, wherein two or more of the obfuscating
instructions are used adjacently to increase the number of the one or more bytes.

3. The method of claim 1, wherein the obfuscating instruction is an
INT instruction.

4. The method of claim 3, including the step of
    inserting the following code:
        JMP $+4
        INT 35h

5. The method of claim 1, wherein the steps are performed
manually.

6. The method of claim 1, wherein the steps are performed by a
software process.

7. The method of claim 6, wherein parameters are supplied to the
software process, the method further comprising
    supplying a parameter to the software process to specify the
frequency that an obfuscating instruction is to be inserted in a predetermined program.

8. The method of claim 7, wherein the frequency is specified as a
number of instructions of the predetermined program between each insertion of the
obfuscating instruction.



9. A computer-readable media including the following instructions
executable by a processor:
    an obfuscating instruction for causing a disassembler to not
disassemble one or more bytes subsequent to the obfuscating instruction; and
    a branch instruction to invoke execution of the one or more bytes
subsequent to the obfuscating instruction.

10. A computer-readable media including the following
instructions executable by a processor:
        JMP $+4
        INT 35h

11. A computer-readable media including the following
instructions executable by a processor:
        JMP $+4
        INT 35h
        INT 20h

12. An apparatus for obfuscating computer program instructions
upon disassembly, the apparatus comprising
    an instruction for causing a disassembler to not disassemble one or
more bytes subsequent to the obfuscating instruction; and
    a branch instruction to invoke execution of the one or more bytes
subsequent to the obfuscating instruction.
SYSTEM FOR OBFUSCATING COMPUTER CODE UPON DISASSEMBLY

ABSTRACT OF THE DISCLOSURE 
A system for preventing accurate disassembly of computer code. Such code 
masking, referred to as "obfuscation," is useful to prevent unwanted parties from making 
copies of an original author's software, obtaining valuable information from the software for 
purposes of breaking into a program, stealing secrets, making derivative works, etc. The 
present invention uses assembly-language instructions so as to confuse the disassembler to 
produce results that are not an accurate representation of the original assembly code. In one 
embodiment, a method is provided where an interrupt, or software exception instruction, is 
used to mask several subsequent instructions. The instruction used can be any instruction 
that causes the disassembler to assume that one or more subsequent words, or bytes, are
associated with the instruction. The method, instead, jumps directly to the bytes assumed 
associated with the instruction and executes those bytes for a different purpose. A preferred 
embodiment works with a popular Microsoft "ASM" assembler language and "DASM" 
disassembler. The instructions used to achieve the obfuscation include "INT" instructions. 
Using this approach, up to 17 bytes of obfuscation can be achieved with five instructions.
Each instruction remains obfuscated until executed and returns to an obfuscated state 
afterwards. 